119 research outputs found

FrameDP: sensitive peptide detection on noisy matured sequences

Author: FUKUNISHI
J. Gouzy
Journet
Lottaz
S. Carrere
T. Schiex
Wasmuth
Publication venue: Oxford University Press
Publication date: 01/01/2009

Summary: Transcriptome sequencing represents a fundamental source of information for genome-wide studies and transcriptome analysis and will become increasingly important for expression analysis as new sequencing technologies take over array technology. The identification of the protein-coding region in transcript sequences is a prerequisite for systematic amino acid-level analysis and more specifically for domain identification. In this article, we present FrameDP, a self-training integrative pipeline for predicting CDS in transcripts which can adapt itself to different levels of sequence quality.

Crossref

PubMed Central

ProdInra

Soft Concurrent Constraint Programming

Author: Bella G.
Bistarelli S.
Boer F. D.
Chen S.
De Nicola R.
Dubois D.
Fargier H.
Francesca Rossi
Schiex T.
Scott D.
Stefano Bistarelli
Ugo Montanari
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2002

Soft constraints extend classical constraints to represent multiple consistency levels, and thus provide a way to express preferences, fuzziness, and uncertainty. While there are many soft constraint solving formalisms, even distributed ones, so far there seems to be no concurrent programming framework where soft constraints can be handled. In this paper we show how the classical concurrent constraint (cc) programming framework can work with soft constraints, and we also propose an extension of cc languages which can use soft constraints to prune and direct the search for a solution. We believe that this new programming paradigm, called soft cc (scc), can also be very useful in many web-related scenarios. In fact, the language level allows web agents to express their interaction and negotiation protocols, and also to post their requests in terms of preferences, and the underlying soft constraint solver can find an agreement among the agents even if their requests are incompatible.

Comment: 25 pages, 4 figures, submitted to the ACM Transactions on Computational Logic (TOCL)

arXiv.org e-Print Archive

CiteSeerX

Crossref

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Università di Padova

Modelling the conference paper assignment problem

Author: de Givry S.
Schiex T.
Schutt A.
Simonis Helmut
Publication date: 01/09/2020

In this paper we describe different constraints and models for the conference paper assignment problem. While the core problem is a simple flow problem, additional constraints often arise to tailor a solution to specific wishes, or to increase perceived fairness for reviewers and/or submissions. We show some results from actual conference paper assignments, and also investigate the scalability of the method for large-scale events.

Irish Universities

Cork Open Research Archive

Many-Valued Institutions for Constraint Specification

Author: BC Pierce
D Benavides
D Sannella
DA Cohen
H Herrlich
I Ţuţu
JA Goguen
JL Fiadeiro
M Aiguier
M Wirsing
MM Hölzl
N Galatos
PD Mosses
R Diaconescu
S Bistarelli
S Bova
T Mossakowski
T Schiex
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

We advance a general technique for enriching logical systems with soft constraints, making them suitable for specifying complex software systems where parts are put together not just based on how they meet certain functional requirements but also on how they optimise certain constraints. This added expressive power is required, for example, for capturing quality attributes that need to be optimised or, more generally, for formalising what are usually called service-level agreements. More specifically, we show how institutions endowed with a graded semantic consequence can accommodate soft-constraint satisfaction problems. We illustrate our approach by showing how, in the context of service discovery, one can quantify the compatibility of two specifications and thus formalise the selection of the most promising provider of a required resource.

Crossref

UPCommons. Portal del coneixement obert de la UPC

Royal Holloway - Pure

University of Dundee Online Publications

Deciphering the genome structure and paleohistory of _Theobroma cacao_

Author: Angé
Ange Marie Risterucci
Anne Dievart
Auré
Bertrand Pitollat
Christopher Viot
Claire Lanaud
Cristian Chaparro
Dave Kudrna
Didier Clement
Diogenes Infante
Dominique Brunel
Emmanuel Guiderdoni
Erika Sallet
Florent Murat
Francis Quetier
Francois Sabot
Gaetan Droc
Ismael Kebe
Jean Marc Aury
Jerome Gouzy
Jerome Salse
Jetty Siva S. Ammiraju
John E. Carlson
Jose Fernandes Barbosa-Neto
Joseph Moroh Akaza
Julie Poulain
Karina Gramacho
Laura Gelley
Maguy Rodier-Goud
Manuel Ruiz
Mark Guiltinan
Mathias Tahi
Mathilde Allegre
Melissa Kramer
Michael Abrouk
Michael Axtell
Michel Boccara
Mickael Bourge
Olivier Fouet
Olivier Panaud
Patrick Wincker
Pierre Costet
Rod Wing
Ronan Rivalan
Schiex T.
Siela Maximova
Spencer Brown
Stephan C. Schuster
Stephanie Sidibe-Bocs
Thierry Legavre
Valentin Guignon
W. Richard McCombie
Wolfgang Golser
Xavier Argout
Xavier Sabau
Xiang Song
Yolande Roguet
Yufan Zhang
Zhaorong Ma
Zi Sh
Publication date: 15/09/2010

We sequenced and assembled the genome of _Theobroma cacao_, an economically important tropical fruit tree crop that is the source of chocolate. The assembly corresponds to 76% of the estimated genome size and contains almost all previously described genes, with 82% of them anchored on the 10 _T. cacao_ chromosomes. Analysis of this sequence information highlighted specific expansion of some gene families during evolution, for example flavonoid-related genes. It also provides a major source of candidate genes for _T. cacao_ disease resistance and quality improvement. Based on the inferred paleohistory of the _T. cacao_ genome, we propose an evolutionary scenario whereby the ten _T. cacao_ chromosomes were shaped from an ancestor through eleven chromosome fusions. The _T. cacao_ genome can be considered as a simple living relic of higher plant evolution.

Crossref

Nature Precedings

Survey sequencing and radiation hybrid mapping to construct comparative maps.

Author: C. Hitte
C. Priat
C. S. Mellersh
C. Soderlund
E. A. Ostrander
E. E. Thomas
E. F. Kirkness
E. H. Margulies
F. Senger
F. Vignaux
F. Yang
K. Lindblad-Toh
M. Boehnke
M. Breen
R. Agarwala
R. Guyon
S. F. Altschul
S. J. O’Brien
T. C. Matise
T. Schiex
W. J. Murphy
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

In: Murphy WJ (ed.), Phylogenomics, Humana Press (Methods in Molecular Biology, 422).

Radiation hybrid (RH) mapping has become one of the most well-established techniques for economically and efficiently navigating genomes of interest. The success of the technique relies on random chromosome breakage of a target genome, which is then captured by recipient cells missing a preselected marker. Selection for hybrid cells that have DNA fragments bearing the marker of choice, plus a random set of DNA fragments from the initial irradiation, generates a set of cell lines that recapitulates the genome of the target organism several-fold. Markers or genes of interest are analyzed by PCR using DNA isolated from each cell line. Statistical tools are applied to determine both the linear order of markers on each chromosome, and the confidence of each placement. The resolution of the resulting map relies on many factors, most notably the degree of breakage from the initial radiation as well as the number of hybrid clones and mean retention value.

A high-resolution RH map of a genome derived from low pass or survey sequencing (coverage from 1 to 2 times) can provide essentially the same comparative data on gene order that is derived from high-coverage (greater than 7x) genome sequencing. When combined with fluorescence in situ hybridization, RH maps are complete and ordered blueprints for each chromosome. They give information about the relative order and spacing of genes and markers, and allow investigators to move between target and reference genomes, such as those of mouse or human, with ease, although the approach is not limited to mammal genomes.

Crossref

PubMed Central

HAL-Rennes 1

Linkage mapping bovine EST-based SNP

Author: A Braun
A Everts-van der Wind
BT Page
C Li
E Casas
EE Connor
GD Schuler
JL Williams
LB Rowe
MD Bishop
ML Clawson
MS Ashwell
N Ihara
P Green
P Havlak
RT Stone
SF Altschul
SM Kappes
SP Wilder
T Schiex
TD Thue
TP Smith
W Barendse
WJ Kent
WM Snelling
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Existing linkage maps of the bovine genome primarily contain anonymous microsatellite markers. These maps have proved valuable for mapping quantitative trait loci (QTL) to broad regions of the genome, but more closely spaced markers are needed to fine-map QTL, and markers associated with genes and annotated sequence are needed to identify genes and sequence variation that may explain QTL. RESULTS: Bovine expressed sequence tag (EST) and bacterial artificial chromosome (BAC) sequence data were used to develop 918 single nucleotide polymorphism (SNP) markers to map genes on the bovine linkage map. DNA of sires from the MARC reference population was used to detect SNPs, and progeny and mates of heterozygous sires were genotyped. Chromosome assignments for 861 SNPs were determined by two-point analysis, and positions for 735 SNPs were established by multipoint analyses. Linkage maps of bovine autosomes with these SNPs represent 4585 markers in 2475 positions spanning 3058 cM. Markers include 3612 microsatellites, 913 SNPs and 60 other markers. Mean separation between marker positions is 1.2 cM. New SNP markers appear in 511 positions, with mean separation of 4.7 cM. Multi-allelic markers, mostly microsatellites, had a mean (maximum) of 216 (366) informative meioses, and a mean 3-lod confidence interval of 3.6 cM. Bi-allelic markers, including SNP and other marker types, had a mean (maximum) of 55 (191) informative meioses, and were placed within a mean 8.5 cM 3-lod confidence interval. Homologous human sequences were identified for 1159 markers, including 582 newly developed and mapped SNP. CONCLUSION: Addition of these EST- and BAC-based SNPs to the bovine linkage map not only increases marker density, but provides connections to gene-rich physical maps, including annotated human sequence. The map provides a resource for fine-mapping quantitative trait loci and identification of positional candidate genes, and can be integrated with other data to guide and refine assembly of bovine genome sequence. Even after the bovine genome is completely sequenced, the map will continue to be a useful tool to link observable phenotypes and animal genotypes to underlying genes and molecular mechanisms influencing economically important beef and dairy traits.

Crossref

Springer - Publisher Connector

PubMed Central

HMM-FRAME: accurate protein domain classification for metagenomic sequences containing frameshift errors

Author: A Kislyuk
AL Delcher
C Quince
C Wang
DT Gibson
E Birney
E Halperin
H Noguchi
I Antonov
K Karplus
M Borodovsky
M Girdea
M Larkin
M Pellegrini
M Peltola
M Rho
N Brown
N Eriksson
R Durbin
R Edwards
S Altschul
S Iwai
T Schiex
W Li
WI Chang
X Guan
Yanni Sun
Yuan Zhang
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: Protein domain classification is an important step in metagenomic annotation. The state-of-the-art method for protein domain classification is profile HMM-based alignment. However, the relatively high rates of insertions and deletions in homopolymer regions of pyrosequencing reads create frameshifts, causing conventional profile HMM alignment tools to generate alignments with marginal scores. This makes error-containing gene fragments unclassifiable with conventional tools. Thus, there is a need for an accurate domain classification tool that can detect and correct sequencing errors. RESULTS: We introduce HMM-FRAME, a protein domain classification tool based on an augmented Viterbi algorithm that can incorporate error models from different sequencing platforms. HMM-FRAME corrects sequencing errors and classifies putative gene fragments into domain families. It achieved high error detection sensitivity and specificity in a data set with annotated errors. We applied HMM-FRAME in Targeted Metagenomics and a published metagenomic data set. The results showed that our tool can correct frameshifts in error-containing sequences, generate much longer alignments with significantly smaller E-values, and classify more sequences into their native families. CONCLUSIONS: HMM-FRAME provides a complementary protein domain classification tool to conventional profile HMM-based methods for data sets containing frameshifts. Its current implementation is best used for small-scale metagenomic data sets. The source code of HMM-FRAME can be downloaded at http://www.cse.msu.edu/~zhangy72/hmmframe/ and at https://sourceforge.net/projects/hmm-frame/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MetWAMer: eukaryotic translation initiation site prediction

Author: A Delcher
A Hatzigeorgiou
A Nadershahi
A Pedersen
A Prats
A Rakotondrafara
A Sachs
A Salamov
A Zien
C Bishop
C Iseli
C Lottaz
C Mathé
D Abramczyk
D Cavener
E Birney
G Crooks
G Gremme
G Li
G Stormo
H Li
H Liu
J Allen
J Crow
L Balvay
L Xing
M de Hoon
M Hirosawa
M Kozak
M Medveczky
M Sparks
M Stanke
M Tech
Michael E Sparks
Q Dong
S Altschul
S Hebsgaard
S Russell
S Salzberg
T Berardini
T Mitchell
T Nishikawa
T Preiss
T Schiex
T Schneider
T Sing
V Brendel
Volker Brendel
Y Saeys
Y Wang
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: Translation initiation site (TIS) identification is an important aspect of the gene annotation process, requisite for the accurate delineation of protein sequences from transcript data. We have developed the MetWAMer package for TIS prediction in eukaryotic open reading frames of non-viral origin. MetWAMer can be used as a stand-alone, third-party tool for post-processing gene structure annotations generated by external computational programs and/or pipelines, or directly integrated into gene structure prediction software implementations. RESULTS: MetWAMer currently implements five distinct methods for TIS prediction, the most accurate of which is a routine that combines weighted, signal-based translation initiation site scores and the contrast in coding potential of sequences flanking TISs using a perceptron. Also, our program implements clustering capabilities through use of the _k_-medoids algorithm, thereby enabling cluster-specific TIS parameter utilization. In practice, our static weight array matrix-based indexing method for parameter set lookup can be used with good results in data sets exhibiting moderate levels of 5'-complete coverage. CONCLUSION: We demonstrate that improvements in statistically-based models for TIS prediction can be achieved by taking the class of each potential start-methionine into account pending certain testing conditions, and that our perceptron-based model is suitable for the TIS identification task. MetWAMer represents a well-documented, extensible, and freely available software system that can be readily re-trained for differing target applications and/or extended with existing and novel TIS prediction methods, to support further research efforts in this area.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central